Calculus of Fuzzy Semantic Typing for Qualitative Analysis of Text

Authors

  • Pero Subasic
  • Alison Huettner
Abstract

Statistical approaches to text mining can be improved through the qualitative representation of free text – ideally, a representation which accommodates ambiguity and imprecision. We introduce a specialized lexicon that assigns semantic categories to words, together with numeric values for centrality and intensity within each category. From this lexicon, we automatically generate an additional set of resources to implement some of the common operations of text mining – profiling, querying, and query/profile expansion and compression – in qualitative domains. We exploit the hierarchical structure of free text (i.e., sentence/paragraph/document) and develop a set of operators whose arguments are fuzzy representations ("profiles") of text at any hierarchical level. Various operators compute the centrality and intensity of categories within a profile, a profile's overall intensity, and the cardinality and fuzziness of a profile; others are used in profile merging, profile expansion or compression, and discovery of related categories from a profile. We address the meaning and modes of deployment of these operators using practical examples. Finally, we discuss the utility of fuzzy typing for various tasks, such as "qualitative browsing" and similarity estimates, and how the existing approach can be enhanced using automatic lexicon expansion and information extraction techniques. We offer a practical software demonstration with several visualization examples, illustrating the power of the proposed operators in affect analysis of news reports and movie reviews.
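As a rough illustration of the kind of profile operators the abstract describes, the following minimal sketch builds a profile from a toy lexicon and computes its cardinality and fuzziness. Everything concrete here is an assumption, not taken from the paper: the lexicon entries, the function names, the use of max for combining centralities and the mean for combining intensities, the sigma-count as cardinality, and a linear index of fuzziness. The abstract does not state the authors' actual formulas.

```python
from collections import defaultdict

# Hypothetical lexicon: word -> list of (category, centrality, intensity).
# The entries and values are invented for illustration only.
LEXICON = {
    "furious": [("anger", 0.9, 0.8)],
    "annoyed": [("anger", 0.6, 0.4)],
    "afraid": [("fear", 0.8, 0.6)],
    "tense": [("fear", 0.5, 0.5), ("anger", 0.3, 0.5)],
}


def profile(words):
    """Build a profile mapping category -> (centrality, intensity).

    Assumption: centrality is combined by fuzzy union (max) and
    intensity by arithmetic mean over all matching lexicon entries.
    """
    cents = defaultdict(list)
    ints = defaultdict(list)
    for word in words:
        for cat, cent, inten in LEXICON.get(word.lower(), []):
            cents[cat].append(cent)
            ints[cat].append(inten)
    return {
        cat: (max(cents[cat]), sum(ints[cat]) / len(ints[cat]))
        for cat in cents
    }


def merge(p, q):
    """Merge two profiles, e.g. two sentences into a paragraph.

    Assumption: max of centralities, mean of the intensities present.
    """
    merged = {}
    for cat in set(p) | set(q):
        entries = [prof[cat] for prof in (p, q) if cat in prof]
        cent = max(c for c, _ in entries)
        inten = sum(i for _, i in entries) / len(entries)
        merged[cat] = (cent, inten)
    return merged


def cardinality(p):
    """Sigma-count: the sum of the centralities (membership degrees)."""
    return sum(c for c, _ in p.values())


def fuzziness(p):
    """Linear index of fuzziness: 0 for crisp profiles, 1 when all
    centralities equal 0.5."""
    if not p:
        return 0.0
    return 2.0 / len(p) * sum(min(c, 1.0 - c) for c, _ in p.values())


sentence = profile(["furious", "tense", "afraid"])
paragraph = merge(profile(["furious"]), profile(["afraid", "annoyed"]))
```

Because a profile is just a mapping from category to a pair of numbers, the same functions apply unchanged at sentence, paragraph, or document level, which is the property the hierarchical operators rely on.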


Similar resources

Affect analysis of text using fuzzy semantic typing

We propose a novel, convenient fusion of natural language processing and fuzzy logic techniques for analyzing the affect content in free text. Our main goals are fast analysis and visualization of affect content for decision making. The main linguistic resource for fuzzy semantic typing is the fuzzy-affect lexicon, from which other important resources—the fuzzy thesaurus and affect category gro...


Non-Newtonian Fuzzy numbers and related applications

Although there are many excellent ways of presenting the principles of the classical calculus, the novel presentation probably leads most naturally to the development of the non-Newtonian calculus. The important point to note is that the non-Newtonian calculus is a self-contained system independent of any other system of calculus. Since this self-contained work is intended for a wide audience, inc...


SOME FUNDAMENTAL RESULTS ON FUZZY CALCULUS

In this paper, we study fuzzy calculus in two main branches: differential and integral. Some rules for finding the limit and $gH$-derivative of the $gH$-difference and constant multiple of two fuzzy-valued functions are obtained, and we also present a fuzzy chain rule for calculating the $gH$-derivative of a composite function. Two techniques, namely Leibniz's rule and integration by parts, are introduced for ...


A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...


EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...



Publication date: 2000